Abstract

RNA molecules are highly modular components that can be used in a variety of contexts for building 
new metabolic, regulatory and genetic circuits in cells. Most synthetic RNA systems to date
rely predominantly on two-dimensional modularity. However, a better understanding and integration
of three-dimensional RNA modularity at structural and functional levels is critical to the development 
of more complex, functional bio-systems and molecular machines for synthetic biology applications. 



Introduction 

In the broadest sense, synthetic biology attempts to 
understand and mimic biological systems in order to 
provide novel biologically inspired solutions for a variety 
of challenges in areas such as medicine, energy production and
product manufacturing. RNAs, such as short interfering 
RNAs (siRNAs), aptamers, riboswitches and ribozymes, 
hold significant promise as modular components for 
developing regulatory genetic circuits and other biologi- 
cal tools for many synthetic biology applications [1-10]. 
As exemplified by complex cellular machineries, like the 
ribosome [11-13], RNase P RNAs [14,15], group I and 
group II introns [16-19] and the spliceosome [20,21], 
RNA is a material of choice for building complex, 
functional nano-architectures [22,23]. The number of 
reviews highlighting the remarkable progress achieved in 
RNA synthetic biology over the past few years points to 
this [24-28]. However, when compared to the variety, 
and structural and functional complexity of natural 
systems, RNA synthetic biology still has a tremendous 
way to go. 

A key structural attribute of RNA relates to its inherent
ability to form diverse tertiary interactions through non- 
canonical base-pairings [29]. In this regard, the biologi- 
cally relevant or active structure of an RNA molecule most 
often has three-dimensional implications. While significant
gains have been made in RNA synthetic biology using
nothing more than the knowledge of an RNA secondary 
structure as the primary determinant of activity in vitro 
[30-34], transition to the vastly more complex cellular 
context can still present unforeseen challenges for these 
same RNA moieties [35,36]. Some of this difficulty may 
stem from the construction of large artificial RNA-based 
systems and devices that lack a critical degree of structural 
and functional robustness. Thus, in our view, further 
developments in RNA synthetic biology complexity — 
including genetic regulatory elements, signaling devices, 
and molecular architectures — will depend on more 
focused efforts to understand and incorporate RNA 
structural principles at the tertiary level. Operating under 
this supposition, the following report is limited to more 
recent developments inspired by RNA nano-technology 
that offer opportunities to construct RNA devices or bio- 
systems containing significant increases in structural and 
functional complexity for use in RNA synthetic biology. 

Structural and functional parts from naturally 
occurring RNAs 

Figure 1. The multiple degrees of modularity in biological systems
As an example, RNAs are chemically, structurally and functionally modular. They can be integrated at the level of multiple metabolic, genetic and regulatory
pathways that are themselves parts of subcellular components or cellular units. At a higher level of integration, RNA regulatory circuits are involved in
the cellular modularity of multicellular organisms and in the developmental mechanisms leading to the specialization of individual organisms. Individual
organisms can themselves be parts within colonies of eusocial species.

Similar to the other fundamental biomolecules, RNA
is a hierarchical molecule containing multiple levels of
modularity [37,38] (see Figure 1). The first and most basic
layer of modularity at the chemical level concerns the four
nucleotide building blocks themselves, which may be
mixed and matched in any arrangement to form primary
sequences. At the next level, the formation of regular
Watson-Crick helices that define RNA secondary (2D)
structures constitutes the most basic form of structural 
modularity. The utility of 2D structure modularity has 
been demonstrated through the fusion of various types 
of RNA aptamers to other functional RNA elements — 
providing the regulatory RNAs (i.e. siRNAs and micro- 
RNAs [miRNAs]) with the ability to selectively target 
specific cell types or allosterically respond to environ- 
mental metabolites [39-41]. This type of modularity gives 
RNA synthetic biology a "plug-and-play" quality that 
facilitates the design of modular composite devices. This 
important capability allows different sequence modules 
to be swapped in and out to tailor the specific activities 
encoded in a particular device without needing to 
redesign the linkages between the modules of the device 
each time to maintain functional activity. Although 
presently less utilized in RNA synthetic biology, the 
same type of modularity is found at the tertiary level of 
RNA structures. It is at this highest level that structural 
modularity involves the three-dimensional (3D) nature of 
RNAs and their tertiary motifs. 

Tertiary RNA motifs consist of highly conserved canonical
and non-canonical hydrogen bonding patterns
between semi-conserved nucleotides. The majority of 
tertiary motif information to date has come from the 
structural data of large naturally occurring RNAs, like the 
ribosome, group I intron, and RNase P. Characterization 
of recurrent hydrogen bonding patterns has led to the 
identification of a variety of recurrent structural motifs, 
including small submotifs (e.g. the U-turn [42], A-minor 
and GA-minor motifs [37,43-44], the UA_handle [38]
and the ribose zipper [45]), terminal and internal loops 
[38,46-51], turns and junctions [37,38,52-59], long- 
range interactions [60-65] and pseudoknots [38,66-67]. 
A unifying characteristic of tertiary RNA motifs relates 
to their ability to also operate in a "plug-and-play" 
type fashion. Properly understood, RNA motifs can be 
swapped in and out of different sequence contexts and 
maintain their structural 3D identities [38,68]. 

Until recently, the use of tertiary RNA motifs (at least as 
far as it relates to synthetic biology) has been generally 
limited to the design of artificial RNA assemblies for the 
construction of RNA nano-structures, nano-particles 
(NPs) and/or scaffolds [5,6,43,68-73]. Because folded 
RNAs can be decomposed into smaller tertiary motifs, 
these structural building blocks can be used for 
engineering artificial molecular units (tectoRNAs) able to
self-assemble into large nano-structures [5,22,23,68]. 
This foundational approach, called RNA architectonics, 
was employed to generate in vitro (in the test tube)
self-assembling RNA filaments (1D), RNA planar arrays (2D)
and RNA polyhedral NPs (3D) with precise control and 
positioning of functional components in 3D space (e.g. 
[5,10,23,68,72-77]). Such work has demonstrated that 
tertiary motifs, forming thermodynamically stable and 
well-defined topologies, can be isolated and inserted 
into a variety of artificial structural contexts without 
compromise. By unraveling the sequence-structure rela- 
tionship for RNA tertiary folds (e.g. [38,43,59]), the 
toolkit for rationally designing and constructing more 
complex and larger modular RNA assemblies is begin- 
ning to be established [78-80]. 

With respect to the more complex naturally occurring 
RNAs (i.e. the ribosome and RNase P), tertiary motifs are 
the core structural elements that facilitate folding and 
assembly as well as molecular recognition, and enzymatic 
and/or regulatory functions. In the case of smaller, less 
complex RNAs, the relationship between RNA structure 
and RNA function remains unchanged, in that structural 
modularity leads to functional modularity. For example, 
RNA takes advantage of a variety of structural contexts to 
regulate the expression of genes, ranging from riboswitches
to small antisense RNA regulators (Figure 1)
[27,28]. This ability is greatly facilitated by functional 
modularity. While different RNA folds can have different 
functions, structurally different RNAs can share identical 
or similar regulatory functions. The usage of different 
structural modules with identical functions as well as the 
mixing and matching of different RNA functions is a key 
component of natural RNA biology. This same principle 
will certainly facilitate the design of novel metabolic and 
genetic regulatory pathways by customizing them to 
particular structural, genomic contexts and organisms 
(Figure 1). Additionally, the interchangeability of different 
functional RNA parts could also be used to unravel and 
discover new principles of functional equivalence 
between apparently distinct cellular operations [81,82]. 
Therefore, rather than being limited to generating new 
divergent synthetic pathways in cells, we anticipate that 
synthetic biology will also contribute to the functional 
convergence and modularity of molecular circuits and 
metabolic pathways at the origin of the buildup of cells 
and organisms [81,82]. 

RNA parts from directed selection and evolution 

The ability to generate novel RNAs with virtually any 
specific predetermined phenotype, using directed evolu- 
tion and in vitro selection, has been instrumental in RNA 
nano-technology and RNA synthetic biology [83,84]. In
this regard, SELEX (systematic evolution of ligands by 
exponential enrichment) has become the strategy of 
choice for the directed evolution and selection of RNA 
having novel binding and/or catalytic properties [85-89]. 
From a structural perspective, in vitro selection offers the 
possibility of creating new structural motifs, not selected 
for in natural systems, including novel long-range 
interactions (e.g. [65,77,90-91]). The possibility of 
selecting new RNA interactions and shapes significantly 
increases our ability to create more complex nano- 
structures and nano-machines. One consequence of its 
success relates to the relative ease associated with 
generating new phenotypes compared to the time and 
effort it takes to thoroughly ascertain the structural 
characteristics of each new aptamer selected [92,93]. The 
ultimate goal of directed evolution strategies is often
more concerned with generating a specific desired
phenotype from a pool of sequences of rather short sizes 
than it is with characterizing the resulting RNA's unique 
structural features. 

Recently, Wittmann and Suess [36] pointed out that,
despite the number and diversity of RNA aptamers 
isolated to date [94,95], only a limited number of 
artificially selected aptamers have been successfully 
incorporated into useful riboswitch applications. They 
suggest that a majority of aptamers selected for in vitro lack 
the structural complexity necessary to function reliably in 
vivo. For example, when placed in the context of a 
regulatory element like a natural riboswitch, neomycin- 
binding aptamers screened in vivo have greater functional 
activity than those initially isolated in vitro [96,97]. 
Aptamers selected for in vivo regulation, like the tetra- 
cycline aptamer [36,98], tend to have increased structural 
complexity — allowing for larger conformational changes, 
higher binding-affinity with fast ligand binding and slow 
release, and greater thermal stability upon ligand binding 
[36]. It is possible that increased structural complexity 
may generate sequences that can support more interac- 
tions with the target ligand. This in turn could lead to 
higher binding affinity and/or increased binding specifi- 
city, which could explain the greater activity in vivo where 
ligand concentrations are likely to be more limited. 
Without knowing the precise cause, such findings, never- 
theless, suggest that the selective pressures present in vivo 
give rise to aptamers with greater structural complexity 
and overall increased robustness, which may in turn allow 
them to work more effectively in these same conditions. 

Structural studies on a variety of natural riboswitches 
suggest that long-range tertiary interactions are funda- 
mental to their functional activity [99-103]. The impor- 
tant contributions that long-range interactions make 
in riboswitches (with respect to functional activity) are 
reminiscent of past research involving the minimal
hammerhead ribozyme. Over a decade of research based 
on the minimal hammerhead sequence provided incon- 
clusive information on its active structure until the full- 
length ribozyme structure was discovered and its atomic 
structure solved, revealing a long-range interaction [104]. 
At first sight, isolation by in vivo SELEX of structurally 
complex aptamers might appear challenging but it is sure 
to offer greater potential for synthetic biology applications 
in comparison to many minimalist aptamers primarily 
selected for binding affinity in vitro. Furthermore, because 
evolution and selection often lead to the isolation of 
more than one RNA fold able to carry out the same function,
conducting selection experiments in vivo may increase the 
potential to produce multiple solution sets having the 
desired functional activity in vivo. Isolating multiple RNA 
solutions (each having functional modularity between the 
structurally distinct and unrelated 3D structures) promises 
to enhance the potential for building up more complex 
RNA molecules by providing additional structural choices 
among a specific type of function. 

The inherently modular nature of RNA has spawned the 
study of RNA structures generated from completely 
random nucleotide sequences devoid of any selection 
pressures. These "never born RNAs" investigate the 
sequence/structure space associated with random RNA 
sequences not tied to selection pressures (whether 
natural or unnatural) [105,106]. In addition to high- 
lighting how a particular structure can arise from many 
unrelated and different RNA sequences, "never born 
RNAs" could provide important insight into the ways in 
which the emergence of RNA structures is influenced by
selection pressures — or the lack thereof. The use of 
structured "never born RNAs" as scaffolds could hold the 
promise of generating new devices with new or novel 
functionalities, but will most importantly contribute to 
unraveling the underlying sequence and structural 
constraints that are important in the selection and design 
of novel RNA functions for biological applications. 

RNA parts with enhanced biomolecular 
interoperability for greater complexity 

Future advancements in RNA synthetic biology will 
require greater interoperability between RNA and other 
types of materials. With respect to DNA and proteins, RNA 
offers distinct advantages. RNA is highly compatible with 
DNA in that it can form predictable base-pairings. In this 
regard, RNA can be used to form complex 2D circuits with 
itself and/or with DNA [107-109]. Furthermore, it can 
regulate and be naturally transcribed from DNA templates 
in vivo. Besides coding for proteins, RNA can co-operate 
with proteins to form ribonucleo-protein (RNP) complexes
for regulation (i.e. in transcription and translation)
and for building complex functional cellular machineries 
(i.e. the ribosome and RNase P). 

As the technologies and methods for the selection of 
artificial RNA aptamers targeting proteins continue to 
advance [110], so do their applications. RNA aptamers 
targeting specific proteins have been used for applications 
including controlled localization of RNA [111], visualiza- 
tion of cellular RNA [112], and directing metabolic 
pathways through the use of engineered RNA scaffolds 
[7]. On the other hand, several RNP complexes have been 
designed to develop responsive genetic switches and 
reprogram cellular behavior [113-118]. The elucidation of 
the binding rules between RNA and Pumilio and FBF 
homology protein (PUF) [119-121] and pentatricopep- 
tide repeats [122] offers interesting possibilities for the 
rational design of novel RNP constructs that can work in 
conjunction with one another [123]. 

Some natural non-coding RNAs, like DsrA, have the 
potential to form large RNA architectures within bacterial 
cells [124-126]. In the same manner, it was demonstrated 
that rationally designed RNA self-assembling nano- 
structures could promote the organization of intracellular 
reactions to produce an artificial hydrogen-producing 
pathway in bacteria (e.g. [7]). In view of these results and 
the potential of RNP complexes to generate self-assemblies 
[127], the development of novel intracellular RNP 
functions associated with subcellular self-assembling 
structures seems limitless for synthetic biology.

Chemically modified RNA parts and synthetic 
ligands 

One of the areas where RNA synthetic biology has truly 
embraced its "synthetic" side is in the area of synthetic 
chemistry. Chemically modified RNA nucleotides and 
hybrids containing non-natural nucleic acids offer distinct 
possibilities in addition to their ability to increase an 
RNA's chemical and/or enzymatic stability. Non-natural
nucleic acids (also referred to as XNA) have the potential 
to expand RNA's functional diversity as well as the ability 
to offer orthogonal systems capable of replication, 
heredity, and evolution [128-130]. As in vitro evolution 
and selection of novel functional XNAs has been recently 
demonstrated [129-131], XNAs open the door to the 
engineering of truly artificial living systems based on 
polymers other than DNA, RNA and proteins. Biocompa- 
tible XNAs containing reactive groups capable of under- 
going enzyme-free ligation reactions [132] and nucleic 
acids containing photo-responsive moieties [133] are 
other examples of expanded functionality. Additionally, 
unmodified RNAs have been selected for their ability to 
bind synthetic small molecules or ligands in order to 
induce fluorescence signals [8,134,135]. 
As areas of RNA synthetic biology become more and
more synthetic, structural and functional considerations 
of new synthetic parts will need to be continually 
re-investigated. For example, their effects on molecular 
recognition and assembly will determine the ways in 
which they can ensure seamless biocompatibility with 
existing cellular processes, such as self-assembly, regula- 
tion, and transcription processes. 

The rise of nano-machines and RNA 
synthetic biology 

The engineering of complex molecular nano-machines 
offers us the prospect of being able to modify, repair and/ 
or control cellular operations for various therapeutic 
purposes. Their development may also increase the 
present toolkit of molecular biology and biochemistry 
for circularizing, modifying or synthesizing RNA and
novel informational polymers. Recently, DNA self-assem- 
bly was used for engineering several mechanical devices, 
artificial nano-machines and assembly lines (e.g. [136- 
140]). However, these DNA nano-machines are still far 
from the remarkable efficiency and complexity of natural 
cellular machines, which essentially rely on RNA and 
proteins. Presently, one strategy for building nano- 
machines consists of deriving new functionalities from 
existing RNA machines like the ribosome. The creation of 
orthogonal ribosomes, operated by an expanded genetic 
code and using four-base (quadruplet) codons, represents one
of these seminal achievements for synthetic biology [141- 
143]. Other strategies take advantage of directed evolution 
and in vitro selection to isolate new complex artificial 
ribozymes from random RNA libraries (e.g. [144-146]), or 
modular RNA libraries that consist of a pre-existing 
functional domain to which random loops are appended 
(e.g. [85,147-151]). While a great many novel functional
RNAs have been isolated by directed evolution and in vitro 
selection from combinatorial libraries of 30 nts to 200 nts 
(e.g. [85-87]), their structural and functional complexity 
are still far from that observed in nature (Figure 2). 
Presently, one of the most advanced nano-machines 
selected is a 200 nt RNA polymerase ribozyme (tC19z) 
with enhanced polymerase activity and fidelity with 
respect to previous class I ribozymes from which it was 
derived [150,152,153] (Figure 2). However, the resulting 
tC19z ribozyme is still partially template sequence- 
dependent and relatively slow (RNA polymerization 
reactions occur over several days [152]), in contrast to 
those catalyzed by RNA polymerase proteins that require 
only minutes to copy much longer templates. The 
question is whether nano-machines with the functional 
and structural complexity of large natural ribozymes can 
be developed in the laboratory. If they can, it will be a 
significant milestone paving the way to complex nano- 
factories with great potential for synthetic biology. 



Figure 2. Structural complexity of natural and artificial ribozymes
and RNA nano-structures as a function of sequence length
A reasonable estimate of the two-dimensional (2D) structural complexity of a
folded RNA is its number of constituent Watson-Crick helices. Note that
most natural ribozymes and the 16S ribosomal RNA (rRNA) are aligned.
In vitro selected ribozymes are circled in orange. Class I ligases are the
most complex ribozymes originating from purely random sequences
[146,150,152,153]. The red star indicates the most complex RNA ligase
isolated by SELEX from a partially random library based on a natural
structural scaffold [147,151]. Diamonds indicate modular nano-structures
(reported in [68,72-73]).
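
The helix-count measure described in the Figure 2 legend can be illustrated with a minimal Python sketch that counts helices in a secondary structure written in dot-bracket notation. The notation, the function names (base_pairs, count_helices) and the min_pairs threshold are assumptions introduced here for illustration and are not taken from the analysis behind Figure 2. In this sketch a helix is taken to be a maximal run of directly stacked base pairs, so a bulge or internal loop splits one stem into two helices, and pseudoknots are not handled.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.add((stack.pop(), i))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def count_helices(dot_bracket, min_pairs=2):
    """Count maximal runs of stacked pairs (i, j), (i+1, j-1), ...

    min_pairs is an illustrative threshold (assumption): runs shorter
    than this are not counted as helices.
    """
    pairs = base_pairs(dot_bracket)
    helices = 0
    for i, j in sorted(pairs):
        if (i - 1, j + 1) in pairs:
            continue  # not the outermost pair of its stack
        length = 1
        while (i + length, j - length) in pairs:
            length += 1
        if length >= min_pairs:
            helices += 1
    return helices


print(count_helices("((((...))))"))      # single hairpin stem -> 1
print(count_helices("((..((...))..))"))  # stem split by an internal loop -> 2
```

Under these assumptions, larger natural RNAs would simply yield larger helix counts; how helices were actually delimited for Figure 2 is not specified in the legend.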



As the functional complexity of natural molecular 
machines is proportional to their structural complexity 
(Figure 2), we hypothesize that the current knowledge in
RNA nano-structure design will provide a foundation 
for building larger artificial nano-machines that approach 
the functional complexity of natural ones. Because of 
technical limitations associated with synthesizing large 
random RNA pools and RNA precipitation, functional
parts isolated from purely random pools are limited to regions
of fewer than 200 nts (e.g. [144-146]). With the RNA
architectonics approach, it is possible to engineer much 
larger RNA nano-structures or modular parts that can 
be used as scaffolds for pre-orienting and positioning 
random loops in 3D space, thereby allowing isolation of 
functional modules within large structural contexts that 
are not accessible from purely random pools (e.g. [147-150]).
Thus, we anticipate that combining RNA
architectonics with directed evolution and in vitro selection
would allow new modular ribozymes with structural
complexity comparable to those of large ribozymes (i.e. 
group I and group II introns, RNase P) to be isolated. 
Additionally, selection pressure could be applied to select 
directional moving parts with functional modules to work 
in a concerted fashion to achieve a particular functional 
task. At the present time, the de novo development of 
complex nano-machines that operate in cells is exciting 
but essentially uncharted territory in synthetic biology. 
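
The appeal of scaffolded libraries over purely random pools can be made more concrete with a back-of-the-envelope calculation. A fully random region of N nucleotides has 4^N possible sequences, so the fraction of sequence space that any physical pool can sample falls off steeply with N. The following Python sketch computes this on a log10 scale for the 30 to 200 nt range mentioned above. The pool size of 10^15 molecules is an assumed, illustrative figure and is not taken from the text, and this sampling argument is offered in addition to, not in place of, the synthesis and precipitation limits cited above.

```python
import math


def log10_sequence_space(n_random):
    """log10 of the number of distinct sequences of length n_random (4**n)."""
    return n_random * math.log10(4)


def log10_pool_coverage(n_random, pool_molecules=1e15):
    """log10 of the largest fraction of sequence space a pool could sample.

    pool_molecules is an assumed pool size (illustrative only).
    The result is capped at 0.0, i.e. complete coverage.
    """
    return min(0.0, math.log10(pool_molecules) - log10_sequence_space(n_random))


for n in (30, 100, 200):
    print(n, round(log10_sequence_space(n), 1), round(log10_pool_coverage(n), 1))
```

Under these assumptions, restricting randomization to a few loops held in place by a designed scaffold keeps the randomized portion short enough to be sampled meaningfully, while the overall construct can be much larger.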

Prospects and questions 

Recent advances in RNA synthetic biology present 
fascinating possibilities but also raise a number of 
intriguing questions. For example, to what degree is 
structural complexity required for implementing com- 
plex cellular behaviors? It is true that rather simple 
molecules like miRNAs, siRNAs, or antisense RNA — 
having limited complexity at the part level — offer 
enough control to enable a great variety of different 
cellular behaviors. At the same time it is also clear, from 
the vantage point of both artificial and natural RNAs, 
that increased structural complexity has its advantages in 
certain cases. This seems to be particularly true when it 
comes to creating RNAs with interesting chemistry (as in 
the case of the ribosome) or selective RNAs having 
multiple functionalities (as in the case of riboswitches 
that require specific binding and allosteric regulatory 
properties).

The identification and characterization of large non-coding
RNAs (lncRNAs) presents a new and emerging
frontier in which to explore the correlation between
structural and functional complexity further. Recent
profiling of the secondary structure of some lncRNAs,
on the order of several hundred nucleotides, suggests
that they fold into complex, highly ordered conformations
[154-157]. While the degree to which lncRNAs
form RNP complexes remains to be seen, at least three
possible structural scenarios regarding their possible
interactions have been put forth [158]. It would seem
that whether lncRNAs exist as compact cores with largely
peripheral protein binding sites, as relatively unstruc- 
tured RNAs with loosely organized protein binding 
domains, or as RNAs lacking a central core but with 
ordered protein binding sites would be the result of very 
different and distinct folding and structural principles. 
For instance, compact 3D RNAs would most likely need 
to rely on and be optimized for long-range tertiary 
interactions while the other two would be optimized to 
avoid long-range tertiary interactions. Uncovering some 
of these fundamental characteristics regarding these 
types of structures could provide further insights into 
the design of synthetic RNP particles [71,123] and/or
reveal unknown functionalities. Ultimately, such find- 
ings have the potential to advance our understanding 
of modern biology as well as provide new tools and 
strategies for RNA synthetic biology. 